*****************************************************************
Version: May 2025
If you use this dataset, please cite the following paper:
"Man versus Machine Learning Revisited" (2025)
SSRN link: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4899584
*****************************************************************


Look-ahead-Bias Free Earnings Forecast Dataset


Variables:


- permno: stock identifier, the CRSP permanent security identifier (PERMNO)
- YearMonth: The last day of each month (YYYY-MM-DD) 
- RF_q1: Random forest earnings forecast for one-quarter-ahead EPS.
- RF_q2: Random forest earnings forecast for two-quarter-ahead EPS.
- RF_q3: Random forest earnings forecast for three-quarter-ahead EPS.
- RF_y1: Random forest earnings forecast for one-year-ahead EPS.
- RF_y2: Random forest earnings forecast for two-year-ahead EPS.


Additionally, we provide forecasts from a set of machine learning models (OLS, partial least squares, LASSO, elastic net, random forest, and LightGBM) and a composite benchmark that averages these forecasts. Variables are named in the format 'XX_EPS_HH', where:
- 'XX': Model abbreviation (OLS, PLS, LASSO, ENet, RF, LGBM, Composite)
- 'HH': Forecasting horizon (Q1, Q2, Q3, Y1, Y2)
For example, OLS_EPS_Q1 represents the OLS forecast for one-quarter-ahead EPS. 
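
The naming scheme above can be enumerated programmatically. The following Python sketch is illustrative only: the helper name forecast_column and the list ALL_FORECAST_COLUMNS are hypothetical and not part of the dataset, and the sketch assumes the column names in the file match the 'XX_EPS_HH' format exactly, including case.

```python
from itertools import product

# Model abbreviations and horizons as listed in this README.
MODELS = ("OLS", "PLS", "LASSO", "ENet", "RF", "LGBM", "Composite")
HORIZONS = ("Q1", "Q2", "Q3", "Y1", "Y2")


def forecast_column(model, horizon):
    """Build a column name in the 'XX_EPS_HH' format (hypothetical helper)."""
    if model not in MODELS:
        raise ValueError(f"unknown model abbreviation: {model!r}")
    if horizon not in HORIZONS:
        raise ValueError(f"unknown horizon: {horizon!r}")
    return f"{model}_EPS_{horizon}"


# Every model/horizon combination implied by the naming scheme.
ALL_FORECAST_COLUMNS = [forecast_column(m, h) for m, h in product(MODELS, HORIZONS)]

print(forecast_column("OLS", "Q1"))  # OLS_EPS_Q1
print(len(ALL_FORECAST_COLUMNS))     # 35
```

Validating the abbreviation before building the name catches case mismatches such as "ENET" or "q1" early, rather than at column lookup time.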


Note: All forecasts are generated at the end of month t. Please see the paper for more details. This dataset was prepared by Yandi Zhu (e-mail: yandi.zhu@stu.pku.edu.cn).
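
Because forecasts are dated at the end of month t and YearMonth holds the last calendar day of the month in YYYY-MM-DD format, dates from other sources may need to be mapped to the same month-end key before matching. The Python sketch below shows one way to do this with the standard library; the function name month_end_key is hypothetical, and whether YearMonth is stored as a string or as a date type in the file is an assumption to check against the actual data.

```python
import calendar
from datetime import date


def month_end_key(d):
    """Return the last calendar day of d's month as a 'YYYY-MM-DD' string.

    Hypothetical helper: assumes YearMonth uses calendar month ends,
    as described in the variable list of this README.
    """
    # monthrange returns (weekday of the first day, number of days in month).
    last_day = calendar.monthrange(d.year, d.month)[1]
    return date(d.year, d.month, last_day).isoformat()


print(month_end_key(date(2024, 2, 10)))  # 2024-02-29
print(month_end_key(date(2023, 12, 1)))  # 2023-12-31
```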
